ABSTRACT


The MT system at IIT Kanpur uses the Paninian framework. It
consists of a morphological analyzer, a local word grouper and a
core parser, collectively referred to as the parser. The core
parser uses karaka charts, which specify the syntactico-semantic
relations, called the karaka relations, between verb groups and
noun groups (also known as demand groups and source groups,
respectively). The parsing problem should satisfy certain general
constraints in addition to the above specified constraints. This
thesis describes the complexity aspects of the system with
these constraints.

Two approaches are taken to determine the complexity of the
parsing problem. In the first approach the parsing problem was
reduced to already well-known problems. There are two types of
karakas, mandatory and optional. With some of the constraints the
problem was reduced to the bipartite matching problem, and with the
constraint which ensures that all mandatory karakas must be filled
the problem was reduced to maximum matching. Finally, with some
additional constraints the problem was reduced to the min cost flow
problem.

In the second approach the parsing problem was formulated as an
integer programming problem. With some of the constraints the
coefficient matrix of the formulation turns out to be unimodular
and totally unimodular.
CHAPTER 1


INTRODUCTION


Translation is the process of converting text in one
language to another language such that its meaning is preserved.
If that process is done using a computer, it is called Machine
Translation (referred to as MT). Translation is difficult
because each language has its idiosyncrasies, such as

1. words are overloaded to represent multiple concepts

2. sentence constructions may vary in the source and target
languages.

So to capture the essence of a sentence the use of a proper 
intermediate representation is essential. 

The Machine Translation problem is dependent on the
choice of the internal representation and on the strategies for
constructing the internal representation from the source text. Once
the internal representation is chosen, the MT problem can be
divided into two parts:

1. Parsing and 2. Generation.

Parsing is the process of assigning a suitable structure 
called the parse structure that captures the internal 
relationships between words in the given source sentence. 

Generation is the process of constructing the target 
language sentence given the parse structure. 
1.1 Paradigms of MT:

There are three different strategies based on the amount
of information present in the parse structure and the manner in
which the generator uses it. These strategies are used by
different MT systems.

Direct MT strategy: In this strategy, no general linguistic
theory or parsing principles are necessarily present. The system
relies on well developed dictionaries, morphological analysis and 
text processing software to gain credible translations of the 
source language text into a series of reasonably equivalent words 
and phrases in the target language. The Georgetown system, to 
translate English to Russian, adopts this strategy. 

Transfer MT strategy: In this strategy, a source language text is 
parsed into an abstract internal representation. A transfer is 
made at both the lexical and structural levels to the target 
language and the translation is generated. For this, source 
language, target language, bilingual lexicons are required. Source 
language lexicon is used in parsing. The level at which the
transfer occurs differs from system to system, ranging from
syntactic deep structure markers to syntactico-semantic
representations.

Interlingua MT strategy: In this strategy, the source text is
parsed and is mapped to a language-free conceptual representation.
The language used to represent the conceptual information is known
as the interlingua. Inference mechanisms then apply contextual and
world knowledge to augment the representation. Finally the
generator maps the appropriate sections of the language-free
representation to the target language.

1.2 IIT-Kanpur MT system:

It uses the third strategy specified above, i.e., the
interlingua approach. The parse structure encapsulates as much
conceptual information as required for the purpose of translation.
The parse structure is centered around the verbal groups of the
sentence and is based on the karaka relations. The karaka
relations are syntactico-semantic relations between the verbal
groups and the source groups in the sentence. The parse structure
is a mapping between the source groups and the karaka relations.
Besides the karaka relations, the sentence may contain
non-karaka relations such as those contributed by adjectival
relations, purpose and relational words, etc. The karaka relations
are used for the disambiguation of word senses. Thus the parse
structure contains a mapping between the source groups and the
karaka relations along with the concepts of the various words in
the text.

1.3 Constraints and Complexity aspects of Parsers 

As mentioned earlier, the parser is a main component of the
machine translation system. So the time taken to translate a source
language sentence to a target language sentence is dependent on
the time taken to form the parse structure.

The MT system uses a constraint parser, which uses
constraints to get suitable candidates for the karaka roles of verb
groups. To get a unique parse structure, in addition to the GNP
(gender, number, person) filter, the karaka-vibhakti filter and the
semantic filter (feature constraints), six additional constraints
are introduced (which are applicable to natural languages). These
constraints will ensure, to a certain degree, a unique parse
structure.

The core parser was implemented using integer
programming. The constraints are formulated as integer programming
equations. But the complexity of the parsing problem is not known,
because the complexity of a problem need not equal the complexity
of the algorithm currently used to solve it. The algorithm only
gives an upper bound; the problem itself may be easier. This
thesis describes the complexity aspects of the parsing problem.

Two approaches are used to study the complexity aspects 
of the parsing problem. 

In the first one, the parsing problem with some of the
constraints was reduced to other known problems, such as the
matching problem and the min cost flow problem. In the second
approach the problem with all six constraints was formulated as an
integer programming problem, and the unimodularity and total
unimodularity conditions were tested on the coefficient matrix of
the integer programming formulation.

The complexity aspects of the system with category
clashes and K merged karaka charts for each verb group are also
discussed.


1.4 Outline of the Thesis:

The second chapter describes the details of the MT
system and the constraints. The third chapter describes the
complexity aspects of the system. Conclusions and pointers for
further enhancements follow in the last chapter.
CHAPTER 2


PANINIAN FRAMEWORK FOR NLP & CONSTRAINT FORMULATION


2.1 Overview of the MT system:

The flow chart of the MT system is shown in Fig 2.1.


Fig 2.1 Flowchart of the IIT-Kanpur MT system
(the final box is the output sentence, i.e., the translated sentence)

Each box is described in detail below.


2.1.1 Morphological analyzer:

At this stage the grammatical information of each word
in a given sentence is obtained. For each word, the category it
belongs to and the feature values associated with that category
are obtained.

The categories are language dependent. For Hindi, the
categories are Noun, Verb, Adjective, etc. The features
associated with the categories are

1. Gender, Number, Person for Nouns

2. Gender, Number, Person and part of TAM (Tense, Aspect
and Modality) for Verbs.

(In the case of Telugu verbs, the complete TAM label is obtained.)

The morphological analyzer retrieves, for each word, the
category to which the word belongs: the Gender, Number and
Person values if the word is a noun, and the Gender,
Number, Person and TAM values if the word is a verb.

If the word belongs to more than one category (category
clash) or if the word has more than one set of feature values, then
the morphological analyzer returns the information for each of
them.

2.1.2 Local Word Grouper:

Though Indian languages are relatively free in word order,
some units follow a fixed order: a main verb is followed by its
auxiliary verbs, nouns are followed by vibhaktis, etc. Such units
are combined into groups based on local information in the local
word grouping stage. The vibhakti is the word combined with a noun
in the noun group. A noun group may contain a 0-vibhakti (the null
string as vibhakti, i.e., no vibhakti).

example 1:

          Ram Pala ko KAtA hE
          (Ram eats the fruit)

In the above sentence 'Ram', 'Pala', 'ko', 'KAtA', 'hE'
are the given words. The morphological analyzer retrieves
information for each of them. Here 'Ram' and 'Pala' are nouns,
'KAtA' is the main verb, 'hE' is an auxiliary verb, and 'ko' is a
vibhakti. The local word grouper forms word groups from these
words. The noun 'Pala' and the vibhakti 'ko' are combined to form a
noun group. The main verb 'KAtA' and the auxiliary verb 'hE' are
combined to form a verb group. Such grouping is done at the local
word grouping stage. In example 1, 'Ram' is a noun group with
0-vibhakti.

In the local word grouping, the verb grouping is done
based on the possible verb sequences that may occur and on
information about agreement; the noun groups are composed from the
form of the noun and the following vibhakti (postposition marker).
However, in those cases where there is ambiguity in identifying the
local word groups and the ambiguity cannot be resolved at that
stage, the decision is postponed. For example, in Hindi, a word
that can be both a noun and an adjective causes ambiguity in
forming a local word group with its succeeding noun.

The noun groups and verb groups are also known as source
groups and demand groups, respectively.

The phenomenon of local word grouping occurs to a
greater extent in languages like Hindi, where the declension
takes the form of a postposition and the auxiliary verbs are
separated by word boundaries. In languages such as Telugu, local
word grouping occurs to a lesser extent, since a word, on most
occasions, is morphologically inflected.

Let $x_1 x_2 \ldots x_p$ be the given input sentence, where each
$x_i$ is a word group.

If $w_1, w_2, \ldots, w_n$ are verb groups
and $s_1, s_2, \ldots, s_m$ are noun groups
(where $m + n \ge p$)
and $L_i w_j$ is the lexical information of the i-th
meaning of word group $w_j$, then the local word grouper will
return

\[ \langle (L_1 w_1, L_2 w_1, \ldots),\ (L_1 w_2, L_2 w_2, \ldots),\ \ldots,\ (L_1 w_n, L_2 w_n, \ldots) \rangle \]

and

\[ \langle (L_1 s_1, L_2 s_1, \ldots),\ (L_1 s_2, L_2 s_2, \ldots),\ \ldots,\ (L_1 s_m, L_2 s_m, \ldots) \rangle \]


2.1.3 Core Parser:

If a word group belongs to both the source groups and the
demand groups, the local word grouper gives both sets of
information. If there are category clashes (a word group belongs to
both source group and demand group), then before the local word
grouper output is used in the parser algorithm, only one set of
information is taken for each word. If a parse structure is not
generated, the parser algorithm has to be executed again with a
different set of information.

The complexity of the system increases if every word
group belongs to every category. If there are K clashes (K word
groups having category clashes), then the constraint solver
(core parser) has to be executed $2^K$ times in the worst case
(taking the number of groups per clashing word group as 2). The
complexity aspects are discussed in chapter 3.
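
The worst-case count can be illustrated with a short sketch that enumerates every way of fixing one category per word group. The dictionary layout and the group names g1, g2, g3 are assumptions made only for this illustration; they are not part of the system described here.

```python
from itertools import product

def category_selections(word_groups):
    """Yield one category choice per word group.

    word_groups maps a word-group name to the list of categories
    returned for it (a hypothetical representation).
    """
    names = list(word_groups)
    for choice in product(*(word_groups[name] for name in names)):
        yield dict(zip(names, choice))

# Two of the three groups clash between "source" and "demand" (K = 2).
groups = {
    "g1": ["source"],
    "g2": ["source", "demand"],
    "g3": ["source", "demand"],
}
selections = list(category_selections(groups))
print(len(selections))  # 4, i.e. 2**K with K = 2
```

Each selection would correspond to one run of the core parser, which is why the number of runs doubles with every additional clashing word group.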

The task of the core parser is to identify karaka 
relations among word groups. It requires the karaka charts. There 
is a separate karaka chart for each verb group in the sentence 
being processed. 

2.1.3.1 Karaka charts:

The karaka chart of a verb group stores information
about the different karakas associated with it. They are also
termed the karaka roles of the verb group. The noun groups are to
be assigned to these karaka roles of the verb groups. The verb
groups are called the demand groups as they make demands about
their karakas, and the noun groups are called source groups
because they satisfy such demands. (A verb group can be a source
group as well when it satisfies the demand of another verb group.
This, however, does not affect its status as a demand group as
well.)

With each karaka, karaka restrictions are also stored in
the karaka chart. There are 3 different restrictions.

1. Optionality of karakas 

2. Karaka-vibhakti mapping 

3. Semantic type information 
2.1.3.1.1 Optionality of karakas:

For each karaka role this field specifies whether
that karaka is mandatory or optional. If it is mandatory then
there must be an assignment for that karaka role, i.e., some source
group must be assigned to that karaka. If it is optional then the
karaka may or may not be filled (this constraint is discussed in
section 2.1.4).

2.1.3.1.2 Karaka-vibhakti mapping:

Each source group (noun group) has a vibhakti (which can
be a 0-vibhakti) with it. Each karaka specifies some vibhaktis
(against each karaka the acceptable vibhaktis are given in the
karaka chart; an example karaka chart is shown in Fig 2.3). Only
source groups having one of those vibhaktis can be assigned to that
karaka role. This constraint is also termed a feature constraint,
because it restricts the feature aspect of the source group.

2.1.3.1.3 Semantic type:

This information is limited to that necessary for
removing ambiguity, if any, in karaka assignment. In other words,
for a given verb, when the karaka-vibhakti mapping is not
sufficient for producing a parse structure, semantic types are
included. The semantic types included have the sole purpose of
karaka disambiguation. This keeps the number of semantic types
under control, and serves as a guiding philosophy for what semantic
types to include. Fig 2.2 shows a possible semantic type
hierarchy which is sufficient for a major part of the language.


            animate                inanimate
           /       \                   |
       human     non-human            etc.
         |           |
        etc.        etc.


Fig 2.2 Semantic type hierarchy

The hierarchy need not always be a tree. Sometimes it will be a
graph.

An example karaka chart for KA (eat) is given in
Fig 2.3.


    karaka    optionality   vibhakti   semantic type

    karta     m             0          animate
    karma     m             0 or ko    -
    karana    o             se


Fig 2.3 Karaka chart of KA (eat)
The gender, number and person agreement between verb
groups and noun groups is verified in the GNP filter of the core
parser (language grammar rules are used for the agreements).

A verb group can have more than one karaka chart. In
that case the core parser has to be executed with one selection of
karaka charts. If a parse structure is not generated, then the core
parser algorithm has to be executed with another selection. If
there are n verb groups and verb group i has $k_i$ (i = 1 to n)
karaka charts, then the core parser algorithm has to be executed
$k_1 \cdot k_2 \cdots k_n$ times in the worst case. For that reason
all the karaka charts of a verb group are merged. The merging
procedure is given below.

The union of the karakas of all karaka charts of the verb
group is taken in the merged karaka chart. The karakas which
are mandatory in all karaka charts of that verb group remain
mandatory in the merged karaka chart. All remaining karakas are
optional in the merged karaka chart. For the remaining
fields, the union of all field values is taken.

The merged karaka chart so formed is used in the core
parser algorithm. The other aspects of the merged karaka chart are
discussed in chapter 3.
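
The merging procedure can be sketched as follows. The chart layout (a dictionary per karaka with "mandatory", "vibhakti" and "semantic" fields) and the sample charts are assumptions made for illustration only. The sketch also assumes that a karaka missing from one of the charts counts as not mandatory in all charts and therefore becomes optional.

```python
def merge_karaka_charts(charts):
    """Merge several karaka charts of one verb group.

    Each chart maps a karaka name to a dict with keys "mandatory"
    (bool), "vibhakti" (set) and "semantic" (set). This layout is an
    assumption made for the sketch.
    """
    merged = {}
    for karaka in set().union(*charts):  # union of karakas of all charts
        entries = [c[karaka] for c in charts if karaka in c]
        merged[karaka] = {
            # mandatory only if present and mandatory in every chart
            "mandatory": len(entries) == len(charts)
            and all(e["mandatory"] for e in entries),
            # remaining fields: union of all field values
            "vibhakti": set().union(*(e["vibhakti"] for e in entries)),
            "semantic": set().union(*(e["semantic"] for e in entries)),
        }
    return merged

# Hypothetical charts for one verb group.
chart_a = {
    "karta": {"mandatory": True, "vibhakti": {"0"}, "semantic": {"animate"}},
    "karma": {"mandatory": True, "vibhakti": {"0", "ko"}, "semantic": set()},
}
chart_b = {
    "karta": {"mandatory": True, "vibhakti": {"ne"}, "semantic": {"animate"}},
    "karana": {"mandatory": False, "vibhakti": {"se"}, "semantic": set()},
}
merged = merge_karaka_charts([chart_a, chart_b])
```

With merged charts the core parser runs once per sentence instead of once per combination of charts, at the price of a looser chart that accepts more candidates.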

2.1.3.2 Karaka Relations among word groups:

For a given sentence, after the word groups have been formed,
karaka charts for the verb groups are identified (first two
phases) and each of the noun groups is tested against the karaka
restrictions in each karaka chart (provided the noun group is to
the left of the verb group whose karaka chart is being tested).
When testing a noun group against a karaka restriction of a verb
group, the vibhakti information and the semantic type are checked
and, if found satisfactory, the noun group becomes a candidate for
the karaka of the verb group. This can be shown in the form of a
constraint graph. The nodes of the graph are the word groups and
there is an arc from a verb group to a noun group labelled by a
karaka, if the noun group satisfies the karaka restriction in the
karaka chart of the verb group (there is an arc from one verb
group to another verb group, if the karaka chart of the former has
a karaka restriction with semantic type as action). The verb groups
are called demand groups as they make demands about their karakas,
and the noun groups are called source groups because they satisfy
demands (a verb group can be a source group as well when it
satisfies the demand of another verb group).

As an example, consider a sentence containing the verb
KA (eat) with its word groups marked.

example 2:     baccA    kele ko       KAtA hE
               child    banana -ko    eats
               (The child eats the banana)

Its constraint graph is shown in Fig 2.4.


[Constraint graph: an arc labelled karta from 'KAtA hE' to 'baccA'
and an arc labelled karma from 'KAtA hE' to 'kele ko']


Fig 2.4 Constraint graph of the example 2 sentence.

It also happens to be the solution graph because all 
source groups are assigned. 

Consider another sentence where the constraint graph is 
different from the solution graph. 

          baccA    kelA    KAtA hE
          (child eats banana)

Here both nouns qualify to be karta and karma (in the
previous example 2, the vibhakti 'ko' in the source group
'kele ko' qualified it to be karma). In such a situation, the
parser produces both parses. To get a unique parse structure some
additional constraints are applied.

For example, if we specify the karta as animate then the parse
structure with 'baccA' as karta and 'kelA' as karma will be
selected.

In terms of the constraint graph, a parse is a
sub-graph of the constraint graph satisfying some constraints. The
constraints are given in section 2.1.4.

2.1.4 Constraints:

1. From a demand node, for all mandatory karaka
labels, out of all edges with a particular mandatory
karaka label, only one edge has to be selected.

(A mandatory karaka of a verb group must be filled with
one source group only.)

2. From a demand node, for all optional karaka labels,
out of all edges with a particular optional karaka
label, no edge or only one edge has to be selected.

(An optional karaka of a verb group must be filled with at
most one source group.)

3. A source node can be assigned to only one demand
node, over all labels and all demand nodes.

(A source group must be assigned to only one karaka
role.)

4. For all demand nodes, the edges labelled by
mandatory karakas must be selected first.

(All mandatory karakas must be filled.)

5. From a demand node, out of all outgoing edges with
a set of m optional karaka labels, n edges must be
selected, where n <= m.

(Out of m optional karakas, n optional karakas must be
filled.)

6. In the solution graph there should not be any arc
intersections (nesting constraint).
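
Constraint 6 can be checked mechanically once each selected arc is written as a pair of word-group positions in the sentence. The sketch below assumes that an arc intersection means two arcs that cross when drawn above the sentence; this reading, and the position-pair encoding, are assumptions of the illustration.

```python
def has_crossing(arcs):
    """Return True if any two arcs cross.

    arcs: list of (p, q) pairs of word-group positions in the sentence.
    Two arcs cross when exactly one endpoint of one lies strictly
    inside the span of the other.
    """
    spans = [tuple(sorted(a)) for a in arcs]
    for idx, (a, b) in enumerate(spans):
        for c, d in spans[idx + 1:]:
            if a < c < b < d or c < a < d < b:
                return True
    return False

nested = [(3, 0), (2, 1)]    # one arc lies inside the other
crossing = [(2, 0), (3, 1)]  # the arcs interleave
```

Arcs that share an endpoint, such as two arcs leaving the same verb group, are not counted as crossing in this sketch.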

2.2 Constraint Parser:

Currently a parse is obtained from the constraint graph
using integer programming. The constraint graph, with the
additional constraints specified in section 2.1.4, is converted
into an integer programming problem by introducing a variable
$x_{ikj}$ for each edge from node i to node j labelled by karaka k
in the constraint graph, so that for every edge there is a
variable. The variables take their values as 0 or 1. A parse is an
assignment of 1 to those variables whose corresponding edges are in
the parse sub-graph, and 0 to those that are not. The equality and
inequality constraints of the integer programming problem can be
obtained from the constraints listed in section 2.1.4. For some of
the constraints the formation of equality and inequality
constraints is given below.

1. For each demand group i, for each of its mandatory
karakas k, the following equality constraint must hold.

\[ M(i,k):\quad \sum_{j} x_{ikj} = 1 \]

Thus there will be equality constraints M(i,k) corresponding
to the mandatory karakas of each of the demand groups.

2. For each demand group i, for each of its optional
karakas k, the following inequality must hold.

\[ O(i,k):\quad \sum_{j} x_{ikj} \le 1 \]

Thus there will be inequality constraints O(i,k) corresponding
to the optional karakas of each of the demand groups.

3. Each source group j must be assigned exactly once.

\[ S(j):\quad \sum_{i,k} x_{ikj} = 1 \]

Thus there will be as many equality constraints S(j) as
there are source groups.

For the other conditions also the equalities are formulated.
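
For a very small constraint graph, the formulation can be illustrated by enumerating all 0/1 assignments and keeping those that satisfy M(i,k), O(i,k) and S(j). This is only a sketch: the exhaustive search stands in for an integer programming solver, and the edge list for the sentence 'baccA kelA KAtA hE' is written out by hand under the assumption that both nouns are candidates for both karakas.

```python
from itertools import product

def solve_parse(edges, mandatory, optional, sources):
    """Enumerate 0/1 assignments satisfying M(i,k), O(i,k) and S(j).

    edges: list of (demand i, karaka k, source j) triples.
    mandatory / optional: sets of (i, k) pairs.
    Exhaustive search, exponential in len(edges); illustration only.
    """
    parses = []
    for values in product((0, 1), repeat=len(edges)):
        x = dict(zip(edges, values))

        def total(pred):
            return sum(v for e, v in x.items() if pred(e))

        ok = (
            all(total(lambda e, ik=ik: e[:2] == ik) == 1 for ik in mandatory)
            and all(total(lambda e, ik=ik: e[:2] == ik) <= 1 for ik in optional)
            and all(total(lambda e, j=j: e[2] == j) == 1 for j in sources)
        )
        if ok:
            parses.append({e for e, v in x.items() if v})
    return parses

edges = [
    ("KAtA hE", "karta", "baccA"),
    ("KAtA hE", "karta", "kelA"),
    ("KAtA hE", "karma", "baccA"),
    ("KAtA hE", "karma", "kelA"),
]
mandatory = {("KAtA hE", "karta"), ("KAtA hE", "karma")}
parses = solve_parse(edges, mandatory, set(), ["baccA", "kelA"])
print(len(parses))  # 2: both assignments of karta and karma survive
```

The two surviving assignments correspond to the two parses of this sentence discussed in section 2.1.3.2, before any semantic type restriction is applied.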

The core parser flow chart is shown in Fig 2.5.


Fig 2.5 Flow chart of Core Parser.

The complexity of the parser with the constraints is
discussed in the next chapter.
CHAPTER 3

ALGORITHMIC ASPECTS OF THE CORE PARSER

The core parser uses integer programming, for
which no polynomial time algorithm is known till now. We study
here whether this particular use of integer programming can be
reduced to problems known to be solvable in polynomial time, e.g.,
the assignment problem, the min-cost flow problem or the linear
programming problem. For that study the constraints specified in
section 2.1.4 are introduced step by step and the results are
given.

3.1 With Constraints 1, 2 and 3:

For constraints 1, 2 and 3 (which are given in 2.1.4),
information from the local word groups and karaka charts is
captured in a bipartite graph and the parsing problem is solved by
a bipartite matching algorithm.

3.1.1 Reduction to Bipartite Graph:

A bipartite graph G(U,V,E) is defined as

    U : set of nodes $\{u_1, u_2, \ldots, u_n\}$
    V : set of nodes $\{v_1, v_2, \ldots, v_m\}$
    E : edges between U and V

and the constraint is $U \cap V = \emptyset$.

The reduction is done in three stages.

1. Formation of U from lexical information of
source word groups.

2. Formation of V from karakas of verb groups.

3. Formation of E from karaka relations among word
groups.

Each of the three stages is described in detail.

3.1.1.1 Formation of U:

Let the local word grouper produce

\[ (L_1 s_1, L_2 s_1, \ldots),\ (L_1 s_2, L_2 s_2, \ldots),\ \ldots,\ (L_1 s_m, L_2 s_m, \ldots) \]

where $s_1, s_2, \ldots, s_m$ are source groups,
$L_1, L_2, \ldots$ are lexical meanings,
and $L_i s_j$ is the i-th lexical meaning of source word $s_j$.

The set $U = \{u_1, u_2, \ldots, u_m\}$ is formed, where

\[ u_1 = \langle L_1 s_1, L_2 s_1, \ldots \rangle \]
\[ u_2 = \langle L_1 s_2, L_2 s_2, \ldots \rangle \]
\[ \vdots \]
\[ u_m = \langle L_1 s_m, L_2 s_m, \ldots \rangle \]

3.1.1.2 Formation of V:

Let $(w_1, w_2, \ldots, w_n)$ be the verb groups in the sentence.

The karaka charts contain karaka information of these
verb groups (i.e., which karakas are there with a verb group).

Using the karaka charts the output is

\[ (k_1 w_1, k_2 w_1, \ldots),\ (k_1 w_2, k_2 w_2, \ldots),\ \ldots,\ (k_1 w_n, k_2 w_n, \ldots) \]

where $k_1, k_2, \ldots$ are the acceptable karakas of the verb
groups and $k_i w_j$ is the i-th karaka role of the verb group
$w_j$.

The demand set $V = \{v_1, v_2, \ldots, v_p\}$ is formed, where

\[ v_1 = k_1 w_1 \]
\[ v_2 = k_2 w_1 \]
\[ \vdots \]
\[ v_t = k_t w_1 \]
\[ v_{t+1} = k_1 w_2 \]
\[ \vdots \]
\[ v_p = k_{\text{last}}\, w_n \]

3.1.1.3 Formation of Edges:

Edges (E) are formed indicating possible karaka
relations. An edge from $u_i$ to $v_j$ is formed if the source word
group $u_i$ is a candidate for the demand word $v_j$. Candidacy is
determined by looking at the vibhakti constraint specified by the
karaka charts, etc.

For each demand $v_j$ acceptable candidates are selected by
applying the following filters.

Initially all source words to the left of the verb group
(in the sentence) of the karaka are candidates. The filters given
in Fig 3.1 are applied one by one to reduce the set of acceptable
candidates. The filters can also be termed feature constraints.

    GNP agreement filter
    (gender, number, person)
            |
            v
    Vibhakti filter
            |
            v
    Semantic filter


Fig 3.1 Filter flow chart

For each $v_j$ the filters are applied on the set
$\{u_1, u_2, \ldots, u_r\}$ (where $\{u_1, u_2, \ldots, u_r\}$ are
the source words to the left of the verb group of $v_j$).

If any $u_i$ is an acceptable candidate then the edge
$(u_i, v_j)$ is added to the edge set E. But here $u_i$ is not
always a single entry. It can be a set of lexical entries. So the
modified version of the formation of the edge set is

For each $v_j$

    if any of the $L_k s_i$ is an acceptable candidate then
    the edge $(u_i, v_j)$ is added to the edge set E
    and the label k is added to the edge.

In this new version extra information (the label) is given
with each edge.

If more than one of the $L_k s_i$ are acceptable then the
edge $(u_i, v_j)$ is added once, and k is arbitrarily selected out
of the satisfying $L_k s_i$ and added as the label.
Example 1:

Let $v_j$ be the karaka we are dealing with.

Let $u_1, u_2, \ldots, u_a$ be the acceptable candidates initially
(source word groups to the left of the verb group of $v_j$)

and let $(L_1 s_1, L_2 s_1, \ldots),\ (L_1 s_2, \ldots),\ \ldots,\ (L_1 s_a, \ldots)$
be the lexical entries of the source word groups.

If any of $(L_1 s_i, L_2 s_i, \ldots, L_r s_i)$ is an acceptable
candidate for $v_j$ then an edge $(u_i, v_j)$ is added to the set E
and the label k is added to the edge (where $L_k$ is the particular
lexical entry of $u_i$).


The formation of the set U from the lexical information of
the source word groups can be done in linear time because it is
just an assignment of a $u_i$ to each source word group.

The formation of the set V is linear time because it is also
just an assignment of a $v_j$ to each karaka role of the verb
groups.

The formation of the edges is of order O(|U|.|V|). So the
formation of the bipartite graph G = (U,V,E) can be done in
polynomial time.
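
The three stages can be sketched together in a few lines. The record layout below (position in the sentence, list of lexical readings, acceptable vibhaktis and semantic types) is hypothetical, and only the vibhakti and semantic filters are modelled; the GNP agreement filter is left out to keep the sketch short.

```python
def form_edges(sources, demands):
    """Build labelled edges (i, j, k) of the bipartite graph.

    sources: list of dicts with "pos" (int) and "readings", a list of
             dicts with "vibhakti" and "semantic".
    demands: list of dicts with "verb_pos" (int), "vibhakti" (set)
             and "semantic" (set, or None when unrestricted).
    All field names are assumptions made for this sketch.
    """
    edges = []
    for j, v in enumerate(demands):
        for i, u in enumerate(sources):
            if u["pos"] >= v["verb_pos"]:
                continue  # candidate must be to the left of the verb group
            for k, reading in enumerate(u["readings"]):
                if reading["vibhakti"] not in v["vibhakti"]:
                    continue  # vibhakti filter
                if v["semantic"] and reading["semantic"] not in v["semantic"]:
                    continue  # semantic filter
                edges.append((i, j, k))  # label k = index of the reading
                break  # the edge is added once, even if several readings pass
    return edges

sources = [
    {"pos": 0, "readings": [{"vibhakti": "0", "semantic": "inanimate"},
                            {"vibhakti": "0", "semantic": "animate"}]},
    {"pos": 1, "readings": [{"vibhakti": "se", "semantic": "inanimate"}]},
    {"pos": 3, "readings": [{"vibhakti": "0", "semantic": "animate"}]},
]
demands = [
    {"verb_pos": 2, "vibhakti": {"0"}, "semantic": {"animate"}},  # karta
    {"verb_pos": 2, "vibhakti": {"se"}, "semantic": None},        # karana
]
edges = form_edges(sources, demands)
```

The two outer loops visit every (source, demand) pair once, which matches the O(|U|.|V|) bound when the number of readings per source group is treated as a constant.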


Now the bipartite graph is formed. This graph is given
as input to the bipartite matching algorithm.


3.1.2 Bipartite Matching Algorithm:

A matching M of a bipartite graph G = (U,V,E) is a subset
of edges with the property that no two edges of M share the same
node. The matching problem is to find a maximum matching of G. An
augmenting path algorithm is given in [papa82].

The algorithm will return a maximum cardinality
matching, i.e., the cardinality of M = min(|U|, |V|), if
such a matching exists.

In the output of the algorithm, the edges in the matching
are of the form $(u_i, v_j)$. The label on each such edge is taken
and the proper $L_k$ is found; it is then substituted in place of
$u_i$ in the matching.
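
A minimal sketch of maximum cardinality bipartite matching by augmenting paths is given below. It is the usual textbook depth-first formulation and is not claimed to be the exact procedure of [papa82]; the function name and the index-pair encoding of edges are assumptions of the illustration.

```python
def maximum_matching(num_u, edges):
    """Maximum-cardinality bipartite matching by augmenting paths.

    num_u: number of nodes in U (indexed 0 .. num_u - 1).
    edges: iterable of (i, j) pairs, i indexing U and j indexing V.
    Returns a dict mapping each matched j to its partner i.
    """
    adj = [[] for _ in range(num_u)]
    for i, j in edges:
        adj[i].append(j)
    match_of_v = {}

    def augment(i, seen):
        for j in adj[i]:
            if j in seen:
                continue
            seen.add(j)
            # j is free, or its current partner can be re-routed
            if j not in match_of_v or augment(match_of_v[j], seen):
                match_of_v[j] = i
                return True
        return False

    for i in range(num_u):
        augment(i, set())
    return match_of_v

matching = maximum_matching(2, [(0, 0), (0, 1), (1, 0)])
```

In the example, source 0 is first matched to demand 0 and is then re-routed to demand 1 so that source 1 can take demand 0, which is the augmenting step.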

The flow chart of the core parser with constraints 1, 2
and 3 is given in Fig 3.2.


Fig 3.2 Flow chart of the core parser with constraints 1, 2 and 3


3.2 Introduction of Constraint 4:

If a distinction has to be made between mandatory and
optional karakas (see constraint 4 in section 2.1.4), the bipartite
matching algorithm is not sufficient. Weights 0 or 1 are assigned
to all edges and the bipartite matching algorithm is extended to
maximum bipartite matching.


3.2.1 Assigning 0, 1 weights to edges:

After the graph G is created, for each edge $(u_i, v_j) \in E$,
$v_j$ is checked to see whether it is mandatory or optional, and a
weight is added to the edge.

If $v_j$ is mandatory then weight 1 is assigned to
the edge $(u_i, v_j)$.

If $v_j$ is optional then weight 0 is assigned.


3.2.2 Maximum Bipartite Matching:

Given a graph G(U,V,E) and a number $w_{ij} \ge 0$
for each edge $(u_i, v_j) \in E$, the problem is to find a matching
of G with the largest possible sum of weights. This ensures that
all the mandatory karakas are covered before optional ones are
considered.


After assigning the weights (0, 1) the graph G is given as
input to the maximum bipartite matching algorithm [papa82], which
will give a matching ensuring constraint 4.

The assignment of weights is a polynomial time algorithm.
Its complexity is O(|E|).
The flow chart of the core parser is given in Fig 3.3.


Fig 3.3 Flow chart of the core parser with constraints 1 to 4
(output: parsed structure)

In the previous weight assignment method only 0 and 1 are
assigned. They will ensure constraint 4. If a more general weight
assignment is needed (e.g., to specify that source word $s_1$ has
more weight than source word $s_2$ to become a candidate of $v_j$),
then the weight assignment block is modified.

If an arbitrary weight assignment is done then
constraint 4 may not hold, because sometimes the weight of
optional edges becomes higher than that of mandatory edges. So to
ensure constraint 4, the weight assignment has to be done
carefully. The procedure is

For all $(u_i, v_j) \in E$,
if $v_j$ is optional then a weight > 0 is assigned to that
edge, i.e., all edges having optional karakas are assigned weights
first.

Let $x_1, x_2, \ldots, x_y$ be the optional karakas.

Let $M_i$ be the maximum weight assigned to any edge having $x_i$;
then

\[ MM = \sum_{i=1}^{y} M_i \]

For all edges having mandatory karakas the weight assigned
to them must be > MM.

If the assignment of weights is done by the above procedure
then constraint 4 is ensured, i.e., all mandatory karakas will
be filled. This weight assignment module replaces the 0,1 weight
assignment module in the flow chart of the core parser. This block
is of complexity O(|E|). So the maximum weight matching will remain
the optimal matching.
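
The procedure can be sketched as follows. The function name, the index-pair encoding of edges and the sample weights are assumptions made for illustration. The choice MM + 1 for mandatory edges is one value satisfying the requirement "> MM"; any larger value would do as well.

```python
def assign_weights(edges, is_mandatory, optional_weight):
    """Weight edges so that mandatory karakas dominate optional ones.

    edges: list of (i, j) pairs (source i, demand karaka j).
    is_mandatory: dict j -> bool.
    optional_weight: dict (i, j) -> positive weight for optional edges.
    Returns (weights, MM).
    """
    # M_j: largest weight on any edge of optional karaka j
    max_per_optional = {}
    for (i, j) in edges:
        if not is_mandatory[j]:
            w = optional_weight[(i, j)]
            max_per_optional[j] = max(max_per_optional.get(j, 0), w)
    mm = sum(max_per_optional.values())  # MM = sum of the M_j
    weights = {}
    for (i, j) in edges:
        if is_mandatory[j]:
            weights[(i, j)] = mm + 1  # strictly greater than MM
        else:
            weights[(i, j)] = optional_weight[(i, j)]
    return weights, mm

edges = [(0, 0), (1, 0), (0, 1), (1, 1), (1, 2)]
is_mandatory = {0: True, 1: False, 2: False}
optional_weight = {(0, 1): 2, (1, 1): 5, (1, 2): 3}
weights, mm = assign_weights(edges, is_mandatory, optional_weight)
```

Since a matching uses at most one edge per optional karaka, the optional edges of any matching together weigh at most MM, so a single mandatory edge always outweighs them. The two passes over the edge list give the O(|E|) bound stated above.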

3.3 Introduction of Constraint 5:

With the constraints 1, 2, 3 and 4 the parsing problem was
solved by the maximum weight matching algorithm (i.e., the
assignment problem). These constraints assume the capacity of each
edge to be 1. But constraint 5 will ask for a capacity of an edge
>= 1 (e.g., 3 optional karakas must be filled in a group of 5
optional karakas). With these constraints the problem no longer
remains an assignment problem, but reduces to the min cost flow
problem (which is a combination of both the min cost problem and
the max flow problem).
3.3.1 Min cost flow Problem:

Let G = (W,A) be a directed network with a cost $c_{ij}$ and a
capacity $u_{ij}$ associated with every arc $(i,j) \in A$. With
each node i a number b(i) is associated, which indicates its supply
or demand depending upon whether b(i) > 0 or b(i) < 0. The minimum
cost flow problem can be stated as follows.

\[ \text{Minimize}\quad z(x) = \sum_{(i,j) \in A} c_{ij}\, x_{ij} \]

subject to

\[ \sum_{\{j : (i,j) \in A\}} x_{ij} \;-\; \sum_{\{j : (j,i) \in A\}} x_{ji} = b(i) \quad \text{for all } i \in W \]

\[ 0 \le x_{ij} \le u_{ij} \quad \text{for all } (i,j) \in A. \]

3.3.2 Formation of G:

Formation of G is done in 2 stages.

1. Formation of W (set of nodes)

2. Formation of A (set of edges)

3.3.2.1 Formation of W:

There are 5 types of nodes

1. all source word groups $S = (s_1, s_2, \ldots, s_n)$

2. all demand karakas $V = (v_1, v_2, \ldots, v_m)$

3. source (SS)

4. tank (T)

5. intermediate nodes E

The set E is created as follows. If a group of optional
karakas has the constraint 5, then a node $e_i$ is created and
edges are created connecting $e_i$ to all nodes in that group of
optional demand karakas.
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3. 3. 2. 2. ForMatlon of Et 


Each edge has information about the cost of the edge, the
lower bound of flow and the upper bound of flow.
There are 5 types of edges.

1. Edges from source to all source word groups, for which

        Lower bound of flow = 1
        Upper bound of flow = 1
        cost of the edges   = 0

2. Edges from source word groups to demand karakas. The
   edges are created by the formation of edges block.

        Lower bound of flow = 0
        Upper bound of flow = 1
        cost of edges will be assigned by the weight
        assignment block.

3. Edges from demand karakas to intermediate nodes

        Lower bound of flow = 0
        Upper bound of flow = 1
        cost of edges       = 0

4. Edges from intermediate nodes to tank

   For these edges the lower bound of flow and the upper
   bound of flow will be the capacities specified by the constraint.
   (For example, if 3 optional karakas out of 6 must be
   filled then

        lower bound of edge = 3
        upper bound of edge = 6)
        cost of edges       = 0

5. Edges from demand karakas to tank:

        lower bound of flow = 0 if optional karaka
                              1 if mandatory
        upper bound of flow = 1
        cost of edges       = 0.
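
The five node types and five edge types can be assembled as in the Python sketch below. Several points are assumptions made only for illustration: every source word group is taken to be a candidate for every karaka, a karaka that belongs to a constraint 5 group is assumed to reach the tank only through its intermediate node, and the names `build_flow_graph`, `e1`, `v11` etc. are hypothetical.

```python
def build_flow_graph(sources, karakas, mandatory, groups, costs=None):
    """Return (nodes, edges); each edge is (tail, head, lower, upper, cost).

    groups: list of (members, lower, upper) triples, one per group of
        optional karakas that carries constraint 5.
    costs: optional dict {(source, karaka): cost} from the weight
        assignment block (defaults to 0 here).
    """
    costs = costs or {}
    intermediate = ["e%d" % (k + 1) for k in range(len(groups))]
    nodes = list(sources) + list(karakas) + ["SS", "T"] + intermediate
    grouped = {v for members, _, _ in groups for v in members}

    edges = []
    # Type 1: source -> source word groups, flow exactly 1.
    for s in sources:
        edges.append(("SS", s, 1, 1, 0))
    # Type 2: source word groups -> demand karakas.
    for s in sources:
        for v in karakas:
            edges.append((s, v, 0, 1, costs.get((s, v), 0)))
    # Types 3 and 4: karakas -> intermediate node -> tank.
    for e, (members, lower, upper) in zip(intermediate, groups):
        for v in members:
            edges.append((v, e, 0, 1, 0))
        edges.append((e, "T", lower, upper, 0))
    # Type 5: remaining karakas -> tank (lower bound 1 if mandatory).
    for v in karakas:
        if v not in grouped:
            edges.append((v, "T", 1 if v in mandatory else 0, 1, 0))
    return nodes, edges


# Hypothetical instance: four source groups, five karakas, and one group
# in which at least one of v12, v13 must be filled.
nodes, edges = build_flow_graph(
    sources=["n1", "n2", "n3", "n4"],
    karakas=["v11", "v12", "v13", "v21", "v22"],
    mandatory={"v11"},
    groups=[(("v12", "v13"), 1, 2)],
)
```

The resulting edge list carries exactly the lower bound, upper bound and cost information listed above and could be handed to any min cost flow routine that supports lower bounds.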

Example 2:

Let n_1 n_2 n_3 n_4 v_1 v_2 be the given sentence,
where n_1, n_2, n_3 and n_4 are source word groups.

Let the karaka charts of v_1 and v_2 be

[Karaka charts of v_1 and v_2]

and the constraint is that one out of {2,3} of v_1 must be
filled (in addition to constraints 1, 2, 3 and 4).

The set W is the union of {S, V, SS, T, E},

here S = {n_1, n_2, n_3, n_4}

     V = {v_11, v_12, v_13, v_21, v_22}

The constraint 5 is applicable to only one group, i.e.,
v_12 and v_13. So there is only one intermediate node. Let it be e_1,
then W = S ∪ V ∪ {SS, T, e_1}.

Let every element in S be a candidate for every demand karaka
in V; then the graph G=(W,A) is as shown in Fig 3.4.

Fig 3.4. Graph G=(W,A) for example 2.

The graph G=(W,A) shown in Fig. 3.4 was created and
given as input to the min cost flow algorithm block [Ahuja], and
the parsed structure is the output. The core parser flow chart is
given in Fig 3.5.


[Flow chart blocks: karaka chart information of verb groups; GNP filter; vibhakti filter; semantic filter; local word grouper; word groups with lexical information; graph G=(W,E) is created; weight assignment block; min cost flow algorithm; assigning actual lexical entry; parsed structure]

Fig 3.5. Flow chart of the core parser with constraints 1 to 5

In the formation of G, the formation of intermediate
nodes and of edges containing the intermediate nodes may increase the
complexity, because if there are n optional karakas then a maximum
of 2^n - (n+1) intermediate nodes will be formed. But if the
following assumption 1 is made then the complexity of the
formation of graph G will still remain polynomial.
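
The bound 2^n - (n+1) is the number of subsets of n karakas that contain at least two members, since the empty set and the n singletons cannot form a group. The short Python sketch below enumerates these subsets for a small n; the reading that a group needs at least two karakas, and the name `possible_groups`, are assumptions made for illustration.

```python
from itertools import combinations


def possible_groups(karakas):
    """All subsets of the optional karakas with at least two members."""
    karakas = list(karakas)
    return [group
            for size in range(2, len(karakas) + 1)
            for group in combinations(karakas, size)]


# With n = 4 optional karakas there are 2**4 - (4 + 1) = 11 possible groups.
groups = possible_groups(["k1", "k2", "k3", "k4"])
```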

Assumption 1: The intersection of groups is null.

With this assumption, the number of intermediate nodes
will be linear in the number of karakas. So the formation of the graph
and the parser algorithm are polynomial.

For many natural language sentences the assumption
holds (I didn't get any counter example).

Up to this stage the parsing problem (with constraints
1, 2, 3, 4 and 5) was reduced to the min cost flow problem, whose
complexity is polynomial time. So the parsing problem complexity
with constraints 1, 2, 3, 4 and 5 is polynomial. This can also be
proved by the total unimodularity and unimodularity conditions on the
coefficient matrix A of the integer programming problem. The
details of integer programming, total unimodularity and
unimodularity are discussed in section 3.4.

3.4. Integer Programming Approach:

Integer programming problem: Given a rational matrix A and
rational vectors b and c,

determine max { cx | Ax <= b, x integral }.

Total unimodularity: A matrix is said to be totally unimodular if
each subdeterminant is +1, 0 or -1.

Theorem: All network matrices are totally unimodular.

Unimodularity: A matrix is said to be unimodular if the
determinant of every basis matrix is +1 or -1.

Let U be a nonsingular matrix. U is called unimodular if
U is integral and has determinant ±1,
or every basis matrix of U has determinant ±1.

Theorem: Every totally unimodular matrix is unimodular.

Each totally unimodular matrix arises from either network
matrices or the matrices of the type

     1 -1  0  0 -1        1  1  1  1  1
    -1  1 -1  0  0        1  1  1  0  0
     0 -1  1 -1  0        1  0  1  1  0
     0  0 -1  1 -1        1  0  0  1  1
    -1  0  0 -1  1        1  1  0  0  1

So if a matrix A is totally unimodular then the problem
can be reduced to a network problem, provided the matrix A has not
arisen from the other type of matrices specified above [Schri86].

Total unimodularity of a given matrix can be tested in
polynomial time.

Therefore, a given matrix can be tested for being a network
matrix in polynomial time.

If a matrix A is unimodular then for each integral vector b,
the polyhedron

{ x | x >= 0 ; Ax = b } is integral.

In other words the problem can be solved by linear
programming (still getting integer solutions because the
polyhedron vertices are all integral).

The above conditions (total unimodularity and
unimodularity) are applied to the integer programming problem
formulated from the parsing problem. The results are discussed in
the following sections.
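
For small matrices the definition of total unimodularity can be checked directly by enumerating every square submatrix. The Python sketch below does exactly that. It is a brute-force illustration that takes exponential time, not the polynomial time test mentioned above, and the two sample matrices are hypothetical.

```python
from itertools import combinations


def det(m):
    """Integer determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, value in enumerate(m[0]):
        if value:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * value * det(minor)
    return total


def is_totally_unimodular(a):
    """True if every square subdeterminant of a is -1, 0 or +1."""
    rows, cols = len(a), len(a[0])
    for k in range(1, min(rows, cols) + 1):
        for r in combinations(range(rows), k):
            for c in combinations(range(cols), k):
                sub = [[a[i][j] for j in c] for i in r]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


# Node-edge incidence matrix of a complete bipartite graph on 2 + 2 nodes.
bipartite = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
# Incidence matrix of a triangle (odd cycle): its determinant is 2.
triangle = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]
```

The bipartite incidence matrix passes the test, which is the situation of the matching formulations, while the odd cycle fails it.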

3.4.1 With constraints 1 to 5:

With the constraints 1 to 5 the parsing problem was
formulated as an integer programming problem (discussed in
section 2.3). The coefficient matrix of this formulation, if
tested for total unimodularity, will satisfy the test.
Also, if the matrix is tested for being a network matrix, it will
satisfy that test, which shows that the parsing problem can be reduced
to the min cost flow problem.

3.4.2 With constraints 1 to 6:

With constraints 1 to 6 the parsing problem will be
formulated as an integer programming problem. The coefficient
matrix A will be tested for the total unimodularity and unimodularity
conditions. If matrix A satisfies these properties then it can
be stated that the parsing problem can be solved by using either
the min cost flow problem or the linear programming problem, whose time
complexity is polynomial. So if the matrix satisfies the
conditions, the time complexity of the parsing problem will be
polynomial.

Now an example is given which will show that matrix A is not
always unimodular.

Example 3: Let n_1 n_2 v_1 v_2 be the given sentence,

where n_1, n_2 are source word groups

and v_1, v_2 are demand word groups.

Let the karaka charts of v_1 and v_2 be

[Karaka charts of v_1 and v_2]

Here v_1 has 3 karaka roles and all are optional. v_2
has 1 karaka role which is optional.

Let the graph be

Fig 3.6. Graph of example 3.

Here each source word is a candidate for every karaka
role.
The integer programming formulation with constraints 1 to 6 is

x_{1A} + x_{2A} + x_{3A} + x_{4A} = 1
x_{1B} + x_{2B} + x_{3B} + x_{4B} = 1
x_{1A} + x_{1B} \le 1
x_{2A} + x_{2B} \le 1
x_{3A} + x_{3B} \le 1
x_{4A} + x_{4B} \le 1
x_{1A} - x_{2B} - x_{3B} \le 1
x_{2A} - x_{1B} - x_{3B} \le 1
x_{3A} - x_{1B} - x_{2B} \le 1

x_{ij} = 0 or 1.


After adding the slack variables the coefficient matrix of
this integer programming problem is

x_1A x_2A x_3A x_4A x_1B x_2B x_3B x_4B  y_1  y_2  y_3  y_4  y_5  y_6  y_7
  1    1    1    1    0    0    0    0    0    0    0    0    0    0    0
  0    0    0    0    1    1    1    1    0    0    0    0    0    0    0
  1    0    0    0    1    0    0    0    1    0    0    0    0    0    0
  0    1    0    0    0    1    0    0    0    1    0    0    0    0    0
  0    0    1    0    0    0    1    0    0    0    1    0    0    0    0
  0    0    0    1    0    0    0    1    0    0    0    1    0    0    0
  1    0    0    0    0   -1   -1    0    0    0    0    0    1    0    0
  0    1    0    0   -1    0   -1    0    0    0    0    0    0    1    0
  0    0    1    0   -1   -1    0    0    0    0    0    0    0    0    1

After interchanging rows and columns the matrix was reduced.

The basis matrix C is considered. After some
transformations C will become

  1   0   0   0   0   0   0   0   0
  0   1   0   0   0   0   0   0   0
  0   0   1   0   0   0   0   0   0
  0   0   0   1   0   0   0   0   0
  0   0   0   0   1   0   0   0   0
  0   0   0   0   0   1   0   0   0
  0   0   0   0   0   0   0  -1  -1
  0   0   0   0   0   0  -1   0  -1
  0   0   0   0   0   0  -1  -1   0

whose determinant is -2.
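
The determinant can be verified mechanically. The Python sketch below builds the transformed basis matrix C as a 6 x 6 identity block together with the 3 x 3 block formed by the nesting constraint rows, and evaluates the determinant exactly with rational arithmetic. The helper `det` is written here only for illustration.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    sign = 1
    result = Fraction(1)
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return int(sign * result)


# 3 x 3 block contributed by the nesting constraint rows.
M = [
    [0, -1, -1],
    [-1, 0, -1],
    [-1, -1, 0],
]
# C = identity block of order 6 followed by the block M.
C = [[1 if i == j else 0 for j in range(9)] for i in range(6)]
C += [[0] * 6 + row for row in M]

determinant = det(C)
```

Since the identity block contributes a factor of 1, the value comes entirely from the 3 x 3 block, whose determinant is -2.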


So the matrix is not unimodular. It means that the
sufficient condition was violated. After adding the nesting
constraint (constraint 6) it is not known whether the parsing problem
can be solved by linear programming. But with assumption 2 the
problem can again be solved in polynomial time.

Assumption 2: Only mandatory karakas are present, i.e., there
are no optional karakas.

Initially, for a karaka, all the source groups to the 
left of the verb group of that karaka are candidates. After that 
the karaka restrictions and constraints are applied. If a verb 
group has N karakas and all are mandatory then only N source 
groups to the left of that verb group will become candidates, in 
order to satisfy the nesting constraint. 

Example 4: Let '''g the given sentence. If the number* 

of karakas of ar© two and all are mandatory, ng cannot satisfy 
any of the karakas of v^ , because the nesting constraint will be 
violated thus, n^ is not a candidate for any of the karakas of v^. 

Applying this domain restriction algorithm will satisfy
the nesting constraint. The parsing problem without the nesting
constraint is solvable by the min cost flow problem, which has
polynomial time complexity. So if only mandatory karakas are there
then the parsing problem is solvable in polynomial time.

Observation 1 can be made about the nesting constraint.
Observation 1: Only optional karakas will take part in the
nesting constraint (edges of optional karakas will form the
nesting constraint equations).

Initially, for each karaka the source groups to the left
of the verb group of that karaka are acceptable candidates. But if
the verb group has M karakas (both mandatory and optional) the
(M+1)th source group to the left of the verb group cannot become a
candidate for the karakas, in order to satisfy the nesting
constraint.


Example 5: Let n_1 n_2 n_3 n_4 n_5 n_6 v_1 v_2 be the input sentence.

Let v_1 have 1 mandatory karaka and 2 optional karakas; then
n_3 cannot become a candidate for the karakas of v_1. So n_4, n_5, n_6 are
the acceptable candidates for the karakas of v_1.
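
The domain restriction can be sketched as a simple slice over the source groups that precede a verb group. The sketch assumes the reading that only the M nearest source groups to the left remain candidates when the verb group has M karakas; the function name and the sample sentence of six source groups are hypothetical.

```python
def acceptable_candidates(sources_left_of_verb, karaka_count):
    """Keep only the karaka_count source groups nearest to the verb group.

    sources_left_of_verb: source groups in sentence order, all of them
        standing to the left of the verb group.
    karaka_count: M, the number of karakas (mandatory and optional).
    """
    if karaka_count <= 0:
        return []
    return list(sources_left_of_verb[-karaka_count:])


# Six source groups to the left of a verb group with 1 mandatory and
# 2 optional karakas, i.e. M = 3.
sources = ["n1", "n2", "n3", "n4", "n5", "n6"]
candidates = acceptable_candidates(sources, 1 + 2)
```

Here the fourth source group to the left of the verb group and everything before it are dropped from the candidate set.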

Now the set of elements which are acceptable to a verb
group is found. Let it be P. Now a subset which satisfies the
mandatory karaka requirement will be found from the above set. For
each mandatory karaka, the vibhakti constraint will be applied to
the source word groups, starting from the first element to the
left of the verb group and moving towards the left. Let M_i be the set of
source word groups that satisfy the ith mandatory karaka (for
example, M_1 is the set of source word groups that satisfy the
first mandatory karaka, say karta). The union of all M_i will
give the set S which will satisfy all mandatory karaka requirements,

S = \bigcup_{i=1}^{k} M_i    (k mandatory karakas)

and S ⊆ P.

The elements of P - S will be acceptable candidates for
optional karakas of this verb group or some other verb group
followed by some other verb group.

So, for all mandatory karakas the sets of source groups
which cannot become candidates for another verb group are formed.
So the nesting constraint will not have any mandatory karakas. Only
edges of optional karakas will form the nesting constraint
equations.
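
The sets P, M_i and S described above map directly onto set operations. The following Python sketch is only an illustration: the function name, the karaka label and the sample groups are hypothetical, and the vibhakti test that produces each M_i is assumed to have been applied already.

```python
def split_candidates(acceptable, satisfied_by):
    """Return (S, P - S) for one verb group.

    acceptable: the set P of source groups acceptable to the verb group.
    satisfied_by: dict {mandatory karaka: source groups M_i that pass
        its vibhakti constraint}.
    """
    p = set(acceptable)
    s = set()
    for groups in satisfied_by.values():
        s |= set(groups) & p      # S is the union of the M_i, kept inside P
    return s, p - s


# Hypothetical verb group: P = {n4, n5, n6}, one mandatory karaka (karta)
# whose vibhakti constraint is met by n6 only.
reserved, free_for_optional = split_candidates(
    ["n4", "n5", "n6"], {"karta": ["n6"]})
```

The groups in `reserved` are tied to mandatory karakas, so only the groups in `free_for_optional` can appear in nesting constraint equations.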

Another assumption is considered.

Assumption 3: There are at most two edges in the graph with
each optional karaka (i.e., at most two source groups can
become candidates for an optional karaka).

With this assumption all the examples constructed turn
out to have unimodular coefficient matrices of integer
programming. An example is given below.

Example 6: Let n_1 n_2 n_3 v_1 v_2 be the sentence.

Let the karaka charts of v_1 and v_2 be



Each source group can be a candidate for each karaka. The initial
graph is given in Fig 3.7.

Fig 3.7. Graph of example 6

After the domain decision procedure the graph reduces to the
graph in Fig 3.8.

Fig 3.8. Reduced graph of example 6

Because n_1 cannot become a candidate for v_1 and n_3 cannot
become a candidate for v_2, the integer programming formulation is

^B 

^IC 

^IC ^ "'IB 

^ ^ ^B 

^A ^ ^B 

^4A ^4B 

^IB - ^ 

^B - ^IC 

x_{ij} = 0 or 1.

and the coefficient matrix of this formulation is unimodular.
Similar examples were formed and no counter example has been found yet.

Assumption 3, made earlier, was satisfied by most of
the natural language sentences that we checked.

Whether the coefficient matrix always satisfies the unimodular
property under assumption 3 is also an open problem.


= 1 

» 1 
< 1 
< 1 
< 1 
< 1 
£ O 
5 O 
3.5. Complexity Aspects:

1. With constraints 1, 2 and 3 only, the problem was reduced to
bipartite matching whose time complexity is

O(min(|U|, |V|) · |E|)

2. With constraints 1 to 4 the problem was reduced to maximum
weight bipartite matching whose time complexity is

O(n^3) where |U| = |V| = n.

3. With constraints 1 to 5 the problem was reduced to the min
cost flow problem whose complexity is O((m log n)(m + n log n)),

where n = number of nodes, m = number of edges.

4. With all 6 constraints the problem was solved by integer
programming.

The flow chart of the core parser is given in Fig 3.9.

[Flow chart: output of the local word grouper ... parsed structure]

Fig 3.9. Flow chart of the core parser.

The above specified core parser complexity aspects are
valid only if the following assumptions are true.

Assumptions 4:

1. There are no source word, demand word clashes
(in other words there are no category clashes).

2. The merged karaka charts can be formed for all senses
of a verb group.

The assumptions can be relaxed in a controlled fashion.

3.5.1. Increasing number of merged karaka charts:

Now the assumption is that only one merged karaka chart
can be formed. By this the core parser flow chart will be executed
only once because there will be a unique demand group. If more
than one karaka chart is there for each verb group, the total
number of times the core parser flow chart is executed will increase.

If k_1, k_2, k_3, ..., k_n are the numbers of karaka charts for
the verbs v_1, v_2, ..., v_n respectively, then the core parser flow
chart will be executed \prod_{i=1}^{n} k_i times.

If at most a fixed number k of merged karaka charts are
formed for each verb group then the complexity of the problem
increases but remains some constant times that of the core parser
complexity because k is bounded.
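
The number of executions can be illustrated by enumerating the chart selections, one chart per verb group. The Python sketch below assumes that the core parser is run once for every such selection, so that the count is the product of the k_i; the chart names are hypothetical placeholders.

```python
from itertools import product
from math import prod


def chart_selections(charts_per_verb):
    """Yield one tuple of karaka charts per run of the core parser."""
    return product(*charts_per_verb.values())


# Hypothetical lexicon: v1 has k_1 = 2 charts, v2 has k_2 = 3 charts.
charts = {
    "v1": ["v1_chart_a", "v1_chart_b"],
    "v2": ["v2_chart_a", "v2_chart_b", "v2_chart_c"],
}
selections = list(chart_selections(charts))
runs = prod(len(options) for options in charts.values())
```

With these numbers the core parser would be executed 2 x 3 = 6 times, once for each element of `selections`.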
3.5.2 Allowing source word, demand word clashes:

If all words in the sentence can be source words and
demand words there will be 2^n demand word selections, where n
is the number of words. So the core parser algorithm has to be
executed an exponential number of times.

However, a study of the lexical data gives the
information that 1% of the words in the given sentence have
clashes. This information is based on the analysis of a Kannada
corpus of a few hundred thousand words.

CHAPTER 4


CONCLUSIONS

The machine translation system consists of a
morphological analyzer, a local word grouper and a core parser.
The complexity of the system is dependent on these three blocks.
The time complexity of the first two blocks is polynomial. But the
complexity of the core parser is dependent on three aspects. They
are

1. Number of category clashes

2. Number of karaka charts associated with each verb
group.

3. Constraints.

The first two aspects are mainly lexicon dependent. But
the constraints play an important role in the complexity of the
core parser. With some of the constraints the problem will remain
polynomial (time) (as in the case with constraints 1 to 5). But
with the inclusion of the island constraint (6th constraint) the
complexity of the problem is not known exactly. The problem was
reduced to the integer programming problem.

The parser should give a unique parse structure. Otherwise
more than one translation will be produced for a given
sentence. The reason for the ambiguous parse structures is the
idiosyncrasies in the language, like overloading of words,
which give category clashes.

Extensions can be made to the present work in the
following directions.

1. Constraints: Presently there are 6 constraints in the
system. It has to be studied whether more constraints are required
to get a unique parse structure and, if so, what the complexity of
the system is.

2. With the island constraint the parsing problem was
reduced to integer programming. But the parsing problem complexity
with the island constraint is not known. It has to be studied
whether the problem is polynomial time, NP-complete or
NP-hard.
Appendix


Devanagari-Roman alphabet mapping


a  A  i  I  u  U  q  e  E  o  O  M  H  z
अ  आ  इ  ई  उ  ऊ  ऋ  ए  ऐ  ओ  औ  ◌ं  ◌ः  ◌ँ

k  K  g  G  f
क  ख  ग  घ  ङ

c  C  j  J  F
च  छ  ज  झ  ञ

t  T  d  D  N
ट  ठ  ड  ढ  ण

w  W  x  X  n
त  थ  द  ध  न

p  P  b  B  m
प  फ  ब  भ  म

y  r  l  v
य  र  ल  व

S  R  s  h
श  ष  स  ह


Examples: rAma  kqRNa  jFAna  Sawru  AzKa  yakRa
          राम   कृष्ण   ज्ञान   शत्रु   आँख   यक्ष



